Heterogeneous networks, which connect informative nodes containing text with different edge types, are routinely used to store and process information in various real-world applications. Graph Neural Networks (GNNs) and their hyperbolic variants provide a promising approach to encode such networks in a low-dimensional latent space through neighborhood aggregation and hierarchical feature extraction. However, these approaches typically ignore metapath structures and the available semantic information. Furthermore, they are sensitive to the noise present in the training data. To tackle these limitations, in this paper we propose Text Enriched Sparse Hyperbolic Graph Convolution Network (TESH-GCN), which captures the graph's metapath structure using semantic signals and further improves prediction in large heterogeneous graphs. In TESH-GCN, we extract semantic node information, which serves as a connection signal to extract relevant nodes' local neighborhood and graph-level metapath features from the sparse adjacency tensor in a reformulated sparse hyperbolic graph convolution layer. These extracted features, in conjunction with semantic features from a language model (for robustness), are used for the final downstream task. Experiments on various heterogeneous graph datasets show that our model outperforms the current state-of-the-art approaches by a large margin on the link prediction task. We also report a reduction in both training time and model parameters compared to existing hyperbolic approaches, owing to the reformulated hyperbolic graph convolution. Furthermore, we illustrate the robustness of the model under different levels of simulated noise in both the graph structure and the text, and explain TESH-GCN's prediction mechanism by analyzing the extracted metapaths.
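As a rough illustration of message passing over a sparse adjacency structure in hyperbolic space (a minimal sketch, not the actual TESH-GCN layer, whose reformulated convolution and metapath extraction are more involved), the following Python snippet aggregates neighbor features in the tangent space at the origin of the Poincaré ball; all function names and the toy graph are hypothetical.

```python
import numpy as np
from scipy import sparse

def exp_map_zero(v, eps=1e-9):
    # Exponential map at the origin of the Poincare ball (curvature -1).
    norm = np.linalg.norm(v, axis=-1, keepdims=True).clip(min=eps)
    return np.tanh(norm) * v / norm

def log_map_zero(x, eps=1e-9):
    # Logarithmic map at the origin of the Poincare ball.
    norm = np.linalg.norm(x, axis=-1, keepdims=True).clip(min=eps)
    return np.arctanh(np.clip(norm, 0.0, 1.0 - eps)) * x / norm

def sparse_hyperbolic_gcn_layer(adj, h, weight):
    """One illustrative layer: aggregate over a sparse adjacency matrix in the
    tangent space at the origin, then map the result back onto the ball."""
    tangent = log_map_zero(h)                 # ball -> tangent space
    aggregated = adj @ (tangent @ weight)     # sparse neighborhood aggregation
    return exp_map_zero(np.tanh(aggregated))  # nonlinearity, tangent -> ball

# Toy usage: 4 nodes, 8-dim features, a small sparse adjacency matrix.
rng = np.random.default_rng(0)
adj = sparse.csr_matrix(np.array([[0, 1, 0, 1],
                                  [1, 0, 1, 0],
                                  [0, 1, 0, 1],
                                  [1, 0, 1, 0]], dtype=float))
h = exp_map_zero(rng.normal(scale=0.1, size=(4, 8)))
w = rng.normal(scale=0.1, size=(8, 8))
print(sparse_hyperbolic_gcn_layer(adj, h, w).shape)  # (4, 8)
```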
Hyperbolic neural networks have recently gained significant attention due to their promising results on several graph problems, including node classification and link prediction. The primary reason for this success is the effectiveness of hyperbolic space in capturing the inherent hierarchy of graph datasets. However, they are limited in terms of generalization and scalability on non-hierarchical datasets. In this paper, we take a completely orthogonal perspective on hyperbolic networks. We model hyperbolic geometry using the Poincar\'e disk and treat the disk itself as the original tangent space. This allows us to replace the non-scalable M\"obius gyrovector operations with Euclidean approximations, and thus simplify the entire hyperbolic model to a Euclidean model with a hyperbolic normalization function. It still operates on a Riemannian manifold, so we call it the pseudo-Poincar\'e framework. We apply the non-linear hyperbolic normalization to the current state-of-the-art homogeneous and multi-relational graph networks and observe significant performance improvements over their Euclidean and hyperbolic counterparts. The primary impact of this work lies in its ability to capture hierarchical features in Euclidean space, so it can replace hyperbolic networks without loss in performance metrics while leveraging the strengths of Euclidean networks, such as interpretability and efficient execution of various model components.
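A minimal sketch of the idea of replacing Möbius arithmetic with plain Euclidean operations followed by a hyperbolic normalization, assuming the normalization simply rescales activations to lie inside the unit Poincaré ball; the exact normalization used in the paper may differ, and the layer below is hypothetical.

```python
import torch

def hyperbolic_normalize(x, eps=1e-9):
    """Illustrative 'hyperbolic normalization': rescale Euclidean activations so
    they lie inside the unit Poincare ball, without any Mobius arithmetic."""
    norm = x.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(norm) * x / norm

class PseudoHyperbolicLayer(torch.nn.Module):
    # A plain Euclidean linear layer followed by the normalization above.
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = torch.nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return hyperbolic_normalize(torch.relu(self.linear(x)))

# Toy usage: all output norms stay below 1, i.e. inside the unit ball.
layer = PseudoHyperbolicLayer(16, 8)
print(layer(torch.randn(4, 16)).norm(dim=-1))
```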
Hyperbolic networks have shown significant improvements over their Euclidean counterparts in several domains involving hierarchical datasets, such as computer vision, graph analysis, and natural language processing. However, their adoption in practice remains limited due to (i) their non-amenability to accelerated deep learning hardware, (ii) vanishing gradients caused by the closure of the hyperbolic space, and (iii) information loss due to frequent mapping between the local tangent space and the fully hyperbolic space. To address these issues, we propose approximating the hyperbolic operators using Taylor series expansions, which allows us to reformulate the computationally expensive hyperbolic tangent and cosine functions as more efficient polynomial formulations. This lets us retain the benefit of preserving the hierarchical structure of hyperbolic space while maintaining scalability on current accelerated deep learning infrastructure. The polynomial formulation also enables us to leverage advances from Euclidean networks, such as gradient clipping and ReLU activations, to avoid vanishing gradients and eliminate errors arising from frequent switching between the tangent and hyperbolic spaces. Our empirical evaluation on standard benchmarks in the graph analysis and computer vision domains shows that our polynomial formulation is as scalable as Euclidean architectures in terms of memory and time complexity, while delivering results as effective as hyperbolic models. Moreover, because we address vanishing gradients and information loss, our formulation also shows substantial improvements over the baselines.
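For concreteness, the truncated Taylor expansions of tanh and artanh below show how expensive hyperbolic functions can be approximated by low-order polynomials; the specific operators and truncation order used in the paper are not given here, so this is only an assumed instance of the general recipe.

```python
import numpy as np

def tanh_taylor(x, order=7):
    """Truncated Taylor expansion of tanh around 0:
    tanh(x) ~ x - x^3/3 + 2x^5/15 - 17x^7/315 (accurate for small |x|)."""
    coeffs = {1: 1.0, 3: -1.0 / 3.0, 5: 2.0 / 15.0, 7: -17.0 / 315.0}
    return sum(c * x**k for k, c in coeffs.items() if k <= order)

def artanh_taylor(x, order=7):
    # artanh(x) = x + x^3/3 + x^5/5 + x^7/7 + ... for |x| < 1.
    return sum(x**k / k for k in range(1, order + 1, 2))

x = np.linspace(-0.5, 0.5, 5)
print(np.max(np.abs(tanh_taylor(x) - np.tanh(x))))       # small approximation error
print(np.max(np.abs(artanh_taylor(x) - np.arctanh(x))))  # small approximation error
```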
Graph Neural Networks (GNNs) have become increasingly important in recent years due to their state-of-the-art performance on many important downstream applications. Existing GNNs have mostly focused on learning a single node representation, even though a node often exhibits polysemous behavior in different contexts. In this work, we develop a persona-based graph neural network framework called PersonaSAGE that learns multiple persona-based embeddings for each node in the graph. Such disentangled representations are more interpretable and useful than a single embedding. Furthermore, PersonaSAGE learns the appropriate set of persona embeddings for each node in the graph, and every node can have a different number of assigned persona embeddings. The framework is flexible, and its general design allows the learned embeddings to be applied widely across domains. We utilize publicly available benchmark datasets to evaluate our approach against a variety of baselines. The experiments demonstrate the effectiveness of PersonaSAGE for a variety of important tasks, including link prediction, where we achieve an average gain of 15% while remaining competitive for node classification. Finally, we also demonstrate the utility of PersonaSAGE with a case study for personalized recommendation of different entity types in a data management platform.
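A minimal sketch of what per-node persona embeddings could look like as a data structure, with a variable number of personas per node and a link score taken as the best match over persona pairs; this is an assumed illustration, not the actual PersonaSAGE architecture or scoring function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each node keeps a variable-size set of persona embeddings (hypothetical data).
persona_embeddings = {
    "u1": rng.normal(size=(3, 16)),  # node u1 has 3 personas
    "u2": rng.normal(size=(1, 16)),  # node u2 has 1 persona
    "v9": rng.normal(size=(2, 16)),
}

def link_score(u, v):
    """Score a candidate edge as the best match over all persona pairs,
    one simple way to use disentangled per-node embeddings."""
    sims = persona_embeddings[u] @ persona_embeddings[v].T  # (|P_u|, |P_v|)
    return float(sims.max())

print(link_score("u1", "v9"))
```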
Personalization in Federated Learning (FL) aims to modify a collaboratively trained global model according to each client. Current approaches to personalization in FL operate at a coarse granularity, i.e., all the input instances of a client use the same personalized model. This ignores the fact that some instances are more accurately handled by the global model due to its better generalizability. To address this challenge, this work proposes Flow, a fine-grained stateless personalized FL approach. Flow creates dynamic personalized models by learning a routing mechanism that determines whether an input instance prefers the local parameters or their global counterpart. Thus, Flow introduces per-instance routing in addition to leveraging per-client personalization to improve accuracy at each client. Further, Flow is stateless, which makes it unnecessary for a client to retain its personalized state across FL rounds. This makes Flow practical for large-scale FL settings and friendly to newly joined clients. Evaluations on the Stackoverflow, Reddit, and EMNIST datasets demonstrate that Flow achieves superior prediction accuracy over state-of-the-art non-personalized and per-client-only personalized FL approaches.
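A minimal sketch of per-instance routing between local (personalized) and global parameters, assuming a sigmoid gate that mixes the two heads; the actual Flow routing mechanism and its stateless training procedure are more involved.

```python
import torch

class RoutedModel(torch.nn.Module):
    """Illustrative per-instance routing between a global and a local head.
    A small router predicts, for each input, how much to trust the local
    (personalized) parameters versus the shared global ones."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.global_head = torch.nn.Linear(in_dim, out_dim)  # shared across clients
        self.local_head = torch.nn.Linear(in_dim, out_dim)   # client-specific
        self.router = torch.nn.Linear(in_dim, 1)

    def forward(self, x):
        gate = torch.sigmoid(self.router(x))                 # per-instance weight in (0, 1)
        return gate * self.local_head(x) + (1 - gate) * self.global_head(x)

model = RoutedModel(32, 10)
print(model(torch.randn(8, 32)).shape)  # torch.Size([8, 10])
```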
We present RecD (Recommendation Deduplication), a suite of end-to-end infrastructure optimizations across the Deep Learning Recommendation Model (DLRM) training pipeline. RecD addresses immense storage, preprocessing, and training overheads caused by feature duplication inherent in industry-scale DLRM training datasets. Feature duplication arises because DLRM datasets are generated from interactions. While each user session can generate multiple training samples, many features' values do not change across these samples. We demonstrate how RecD exploits this property, end-to-end, across a deployed training pipeline. RecD optimizes data generation pipelines to decrease dataset storage and preprocessing resource demands and to maximize duplication within a training batch. RecD introduces a new tensor format, InverseKeyedJaggedTensors (IKJTs), to deduplicate feature values in each batch. We show how DLRM model architectures can leverage IKJTs to drastically increase training throughput. RecD improves the training and preprocessing throughput and storage efficiency by up to 2.49x, 1.79x, and 3.71x, respectively, in an industry-scale DLRM training system.
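A minimal sketch of the deduplication idea, assuming the common pattern of keeping one copy of each repeated feature value plus an inverse index per sample (similar in spirit to, but not the same as, the IKJT format); all names and the toy batch below are hypothetical.

```python
import numpy as np

# Hypothetical batch: 6 training samples from 2 user sessions; the user-side
# feature value repeats within each session.
sample_session_ids = np.array([101, 101, 101, 202, 202, 202])
user_feature_per_session = {101: np.array([0.3, 1.2]), 202: np.array([0.9, 0.1])}

# Store each distinct value once, plus an inverse index per sample: compute on
# the unique values, then expand back to per-sample shape only when needed.
unique_sessions, inverse = np.unique(sample_session_ids, return_inverse=True)
unique_values = np.stack([user_feature_per_session[s] for s in unique_sessions])

embedded_unique = unique_values * 2.0           # stand-in for an expensive lookup/compute
embedded_per_sample = embedded_unique[inverse]  # gather back to per-sample shape
print(embedded_per_sample.shape)                # (6, 2): computed twice, not six times
```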
Despite the huge advancement in knowledge discovery and data mining techniques, the X-ray diffraction (XRD) analysis process has mostly remained untouched and still involves manual investigation, comparison, and verification. Due to the large volume of XRD samples from high-throughput XRD experiments, it has become impossible for domain scientists to process them manually. Recently, they have started leveraging standard clustering techniques to reduce the number of XRD pattern representations requiring manual effort for labeling and verification. Nevertheless, these standard clustering techniques do not handle problem-specific aspects such as peak shifting, adjacent peaks, background noise, and mixed phases, and hence result in incorrect composition-phase diagrams that complicate further steps. Here, we leverage data mining techniques along with domain expertise to handle these issues. In this paper, we introduce an incremental phase mapping approach based on binary peak representations using a new threshold-based fuzzy dissimilarity measure. The proposed approach first applies an incremental phase computation algorithm to discrete binary peak representations of XRD samples, followed by hierarchical clustering or manual merging of similar pure phases to obtain the final composition-phase diagram. We evaluate our method on the composition spaces of two ternary alloy systems, Co-Ni-Ta and Co-Ti-Ta. Our results are verified by domain scientists and closely resemble the manually computed ground-truth composition-phase diagrams. The proposed approach takes us closer to the goal of complete end-to-end automated XRD analysis.
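A minimal sketch of a threshold-based fuzzy dissimilarity between binary peak vectors that tolerates small peak shifts; the actual measure proposed in the paper is not reproduced here, so treat the function below as an assumed illustration of the idea.

```python
import numpy as np

def fuzzy_peak_dissimilarity(a, b, shift_tol=2):
    """Illustrative threshold-based dissimilarity between two binary peak
    vectors: a peak in `a` counts as matched if `b` has a peak within
    `shift_tol` positions (tolerating small peak shifts), and vice versa."""
    pos_a, pos_b = np.flatnonzero(a), np.flatnonzero(b)
    if len(pos_a) == 0 or len(pos_b) == 0:
        return 1.0
    matched_a = sum(np.min(np.abs(pos_b - p)) <= shift_tol for p in pos_a)
    matched_b = sum(np.min(np.abs(pos_a - p)) <= shift_tol for p in pos_b)
    return 1.0 - (matched_a + matched_b) / (len(pos_a) + len(pos_b))

x = np.zeros(50, dtype=int); x[[5, 20, 33]] = 1
y = np.zeros(50, dtype=int); y[[6, 21, 40]] = 1   # two peaks shifted by 1, one unmatched
print(round(fuzzy_peak_dissimilarity(x, y), 3))   # ~0.333
```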
When one business sells to another business (B2B), the buying business is represented by a group of individuals, called an account, who collectively decide whether to buy. The seller advertises to and interacts with each individual, mostly through digital means. The sales cycle is long, typically spanning several months. Individuals belonging to an account are heterogeneous in how they seek information, so the seller needs to score each individual's interest over a long horizon to decide which individuals must be reached and when. Moreover, the buying decision rests with the account, which must be scored to project the likelihood of purchase, and this decision may keep changing until the actual decision is made, which is symptomatic of group decision making. We score the decisions of both the account and its individuals in a dynamic manner. Dynamic scoring allows the opportunity to influence different individual members at different points in time over the long horizon. The dataset contains behavioral logs of each individual's communication activities with the seller; however, there is no data on the consultations among the individuals that lead to the decision. Using a neural network architecture, we propose several approaches to aggregate information from the individual members' activities to predict the group's collective decision. Multiple evaluations show strong model performance.
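A minimal sketch of one way to aggregate member-level activity into an account-level purchase score, assuming a recurrent encoder per individual and mean pooling across the account's members; the paper's aggregation approaches and scoring model are not reproduced here.

```python
import torch

class AccountScorer(torch.nn.Module):
    """Illustrative aggregation of member-level activity into an account-level
    purchase score: encode each member's activity sequence, pool across members."""
    def __init__(self, n_event_types, hidden=32):
        super().__init__()
        self.embed = torch.nn.Embedding(n_event_types, hidden)
        self.gru = torch.nn.GRU(hidden, hidden, batch_first=True)
        self.head = torch.nn.Linear(hidden, 1)

    def forward(self, member_event_ids):
        # member_event_ids: (n_members, seq_len) event-type ids for one account
        _, h = self.gru(self.embed(member_event_ids))   # h: (1, n_members, hidden)
        account_repr = h.squeeze(0).mean(dim=0)         # pool over members
        return torch.sigmoid(self.head(account_repr))   # purchase probability

scorer = AccountScorer(n_event_types=20)
print(scorer(torch.randint(0, 20, (4, 30))))  # one account, 4 members, 30 events each
```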
Previous work on unsupervised sentence embeddings has focused on data augmentation methods such as dropout and rule-based sentence transformations. However, these methods offer limited fine-grained control over the semantics of the augmented views of a sentence, which leads to supervision signals that are insufficient for capturing the semantic similarity between similar sentences. In this work, we find that using neighboring sentences makes it possible to capture more accurate semantic similarity between similar sentences. Based on this finding, we propose RankEncoder, which uses the relationships between an input sentence and sentences in a corpus to train an unsupervised sentence encoder. We evaluate RankEncoder from three perspectives: 1) semantic textual similarity performance, 2) efficacy on similar sentence pairs, and 3) the universality of RankEncoder. Experimental results show that RankEncoder achieves a Spearman's correlation of 80.07%, an absolute improvement of 1.1% over the previous state-of-the-art performance. The improvement is even more significant on similar sentence pairs, where it reaches 1.73%. In addition, we demonstrate that RankEncoder is universally applicable to existing unsupervised sentence encoders.
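A minimal sketch of using corpus sentences as reference points: each sentence is represented by the ranks of its similarities to the corpus, and two sentences are compared through their rank vectors; this is an assumed reading of the intuition, not RankEncoder's actual training objective.

```python
import numpy as np

def rank_vector(query_emb, corpus_embs):
    """Rank of every corpus sentence w.r.t. the query (1 = most similar).
    Comparing two sentences via these rank vectors uses the corpus as a
    set of reference points."""
    sims = corpus_embs @ query_emb
    order = np.argsort(-sims)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks

rng = np.random.default_rng(0)
corpus = rng.normal(size=(100, 64))
s1, s2 = rng.normal(size=64), rng.normal(size=64)
r1, r2 = rank_vector(s1, corpus), rank_vector(s2, corpus)
# Spearman-style similarity between the two rank vectors.
print(np.corrcoef(r1, r2)[0, 1])
```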
One key challenge in learning online recommendation models is the temporal domain shift, which causes a mismatch between the training and test data distributions and hence domain generalization error. To overcome this, we propose learning a future gradient generator that forecasts the gradient information of the future data distribution, so that the recommendation model can be trained as if we were able to look ahead into the future of its deployment. Compared with batch updates, our theory suggests that the proposed algorithm achieves a smaller temporal domain generalization error, measured by a gradient variation term in local regret. We demonstrate the empirical advantage by comparing against various representative baselines.
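A minimal sketch of the idea of a future gradient generator, assuming a toy setting where the generator is fit to map the gradient on current data to the gradient on later (shifted) data and is then used to step the model; the paper's generator, objective, and theoretical guarantees are not reproduced here.

```python
import torch
import torch.nn.functional as F

# Toy recommendation model and a generator that predicts next-period gradients.
model = torch.nn.Linear(10, 1)
grad_dim = sum(p.numel() for p in model.parameters())
grad_generator = torch.nn.Linear(grad_dim, grad_dim)

def flat_grad(x, y):
    # Flattened gradient of a squared-error loss w.r.t. the model parameters.
    loss = F.mse_loss(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

# "Current" batch and a later batch drawn from a shifted distribution.
x_now, y_now = torch.randn(64, 10), torch.randn(64, 1)
x_future, y_future = torch.randn(64, 10) + 0.5, torch.randn(64, 1)

# Fit the generator to map today's gradient to the (observed) future gradient.
g_now, g_future = flat_grad(x_now, y_now), flat_grad(x_future, y_future)
opt = torch.optim.Adam(grad_generator.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    F.mse_loss(grad_generator(g_now), g_future).backward()
    opt.step()

# At deployment, step the model with the *predicted* future gradient.
g_pred = grad_generator(flat_grad(x_now, y_now)).detach()
with torch.no_grad():
    offset = 0
    for p in model.parameters():
        p -= 0.01 * g_pred[offset:offset + p.numel()].view_as(p)
        offset += p.numel()
```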